ABSTRACT


A method based on repeated quadratic interpolation is proposed for satisfying
a weak line search criterion given by A. A. Goldstein, and it is shown to be
efficient and to have guaranteed termination. An extension of the line search
criterion appropriate to minimum norm problems is sketched.


AMS(MOS) Subject Classifications: 65D05, 65H10, 65K05

Key Words: Unconstrained minimization, Nonlinear equations, Descent algorithms.
SIGNIFICANCE AND EXPLANATION


Many algorithms for the minimization of functions use iterative procedures
that first choose a search direction, and then decide how far to proceed along
the search direction. These methods rely heavily on the availability of a standard
procedure to control the length of step taken at each iteration. This report
describes one possible such procedure which is both efficient and has guaranteed
termination. It also has the advantage that it requires only function values to
carry out the line search, although it applies in general only to methods which
compute derivatives in estimating the search direction.

Computational experience has been very satisfactory over a number of years,
and on average little more than one function value needs to be computed at each
step. Also the basic ideas of the line search test have been adapted to minimum
norm problems, of which one important particular case is Newton's method. Here,
provided the search directions are bounded and the singularities of the Jacobian
are isolated, limit points of the iteration are solutions to the nonlinear
system. The boundedness condition on the search direction cannot be relaxed.


The responsibility for the wording and views expressed in this descriptive summary
lies with MRC, and not with the author of this report.
AN EFFICIENT WEAK LINE SEARCH WITH GUARANTEED TERMINATION


which are satisfied if the step achieves a significant reduction in the function value. The
first kind of line search fitted well with the development of conjugate direction algorithms,
where conceptually important theoretical information is available when the exact minimum is
computed. In general the approach used in these methods is to fit an interpolating function
(typically a quadratic or cubic polynomial) to the most recent information, compute the
minimum of this interpolant, and use this as a new estimate of the minimum. Methods of this kind
are discussed in [1], and recently Robinson [7] has pointed out problems with repeated
quadratic interpolation. However, these methods are usually significantly less efficient, in the
sense that they require more function values overall, than methods of the second type (the weak
line search methods). Our purpose here is to give an efficient implementation with guaranteed
termination for satisfying a weak line search criterion due to Goldstein [3].

To specify this criterion let $F: R^n \to R^1$ be the function to be minimized. At the
current point $x$ the descent algorithm generates the vector $h$ determining the descent step.
It is assumed that $h$ is downhill, which means that $\exists\, \delta > 0$ such that

$$\nabla F(x)h \le -\delta\,\|\nabla F(x)\|\,\|h\| \tag{1.1}$$

$\forall x \in R$, where $R$ is a region containing the successive iterates, also that $F$ is
bounded below on $R$, and that $F \in C^2(R)$. The exact specification of the norm in (1.1)
is not important. Let

$$\psi(x, h, \lambda) = \frac{F(x) - F(x + \lambda h)}{-\lambda \nabla F(x)h}. \tag{1.2}$$


Initially we try a preset value $\lambda_0$ (although there would likely also be a test on
$\|h\|$ to ensure that $\lambda_0 h$ is not too large), and accept this if

$$0 < \sigma \le \psi(x, h, \lambda) \tag{1.3}$$

with $\lambda = \lambda_0$. If this test fails we seek a $\lambda$ satisfying both (1.3) and

$$\psi(x, h, \lambda) \le 1 - \sigma. \tag{1.4}$$

Clearly we must choose $\sigma < 1/2$, and usually it is taken small (say $10^{-4}$).
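The test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `goldstein_ratio` and `goldstein_accepts` are hypothetical, points are plain lists of floats, the directional derivative $\nabla F(x)h$ is supplied by the caller as `slope`, and the ratio is taken with the sign convention under which it tends to 1 as the step tends to zero.

```python
def goldstein_ratio(F, x, h, lam, slope):
    """psi(x, h, lam) of (1.2): actual decrease over the decrease
    predicted by a linear model. slope = grad F(x) . h (negative)."""
    x_new = [xi + lam * hi for xi, hi in zip(x, h)]
    return (F(x) - F(x_new)) / (-lam * slope)


def goldstein_accepts(psi, sigma, is_initial_step):
    """(1.3) alone for the preset step lam_0, (1.3) and (1.4) otherwise."""
    if psi < sigma:                      # (1.3) fails: decrease too small
        return False
    return is_initial_step or psi <= 1.0 - sigma   # (1.4)


# Illustrative objective (an assumption, not from the report).
def example(v):
    return v[0] ** 2 + 4.0 * v[1] ** 2


x = [1.0, 1.0]
h = [-2.0, -8.0]                         # minus the gradient at x
slope = -68.0                            # grad F(x) . h = -(2^2 + 8^2)

psi_long = goldstein_ratio(example, x, h, 1.0, slope)          # overshoots
psi_exact = goldstein_ratio(example, x, h, 68.0 / 520.0, slope)  # line minimum
psi_tiny = goldstein_ratio(example, x, h, 1e-6, slope)         # nearly linear
```

For a quadratic the ratio at the exact line minimum is 1/2, which is why any $\sigma < 1/2$ leaves the exact minimizer acceptable; a very short step gives a ratio close to 1 and is rejected by (1.4) unless it is the preset step.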

Remark (i). It is important to note that the Goldstein procedure is not usually appropriate
if derivatives of $F$ are not available, unless an alternative to a difference calculation is
available to compute $\nabla F(x)h$ (an example is indicated in section 4).

(ii) The significance of $\psi$ can be seen by noting that it relates the decrease in $F$ in the
step $\lambda$ to the decrease that would be observed if $F$ were linear. The Goldstein test imposes
a sense in which these two quantities are comparable. For this reason it is not surprising
that it has proved very useful both in the analysis and implementation of algorithms using
function and first derivative information.

2.  Properties  of  the  Goldstein  test. 

The basic assumption made on $F$ is that it can be expanded in the form

$$F(x + \lambda h) = F(x) + \lambda \nabla F(x)h + \lambda^2 \|h\|^2 \int_0^1 (1-s)\,F''(x + (\lambda h)s)\,ds \tag{2.1}$$

where the $'$ indicates differentiation in the direction defined by $h$, which is assumed
bounded. We now give two key properties which follow from the form of the test and which are
basic to its usefulness.

(i) For each $x$, and for $h$ satisfying (1.1), either $\nabla F(x)h = 0$ or $\exists \lambda$ such
that either $\lambda = \lambda_0$ satisfies (1.3) or $\lambda$ satisfies (1.3) and (1.4).

(ii) For any algorithm producing directions satisfying (1.1) applied to $F$, where $F$
has bounded second derivatives in $R$, limit points of the sequence of iterates
$\{x_i\}$ are points at which $\nabla F(x)h = 0$.



To show (i) we have

$$\psi(x, h, \lambda) = 1 - \frac{\lambda \|h\|}{|F'(x)|} \int_0^1 (1-s)\,F''(x + (\lambda h)s)\,ds \;\to\; 1, \quad \lambda \to 0 \ \text{for fixed } x, h. \tag{2.2}$$
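The intermediate step behind (2.2) can be written out as follows. This is a sketch under two assumptions: that $\psi$ is the ratio of the actual decrease to the decrease of the linear model (so that it tends to 1 for short steps), and that $\nabla F(x)h = \|h\|F'(x)$ with $F'(x) < 0$ for a downhill $h$. The shorthand $I(\lambda)$ and the bound $M$ on $|F''|$ are introduced here only for the illustration.

```latex
\begin{aligned}
I(\lambda) &= \int_0^1 (1-s)\,F''(x + (\lambda h)s)\,ds, \\
F(x) - F(x + \lambda h) &= -\lambda\|h\|F'(x) - \lambda^2\|h\|^2 I(\lambda)
  && \text{by (2.1)}, \\
\psi(x, h, \lambda) &= \frac{-\lambda\|h\|F'(x) - \lambda^2\|h\|^2 I(\lambda)}{-\lambda\|h\|F'(x)}
  = 1 + \frac{\lambda\|h\|}{F'(x)}\,I(\lambda)
  = 1 - \frac{\lambda\|h\|}{|F'(x)|}\,I(\lambda), \\
|\psi(x, h, \lambda) - 1| &\le \frac{\lambda\|h\|M}{2\,|F'(x)|}
  && \text{if } |F''| \le M .
\end{aligned}
```

The last line makes the limit explicit: for fixed $x$ and $h$ with $F'(x) \ne 0$ the deviation of $\psi$ from 1 is $O(\lambda)$.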

To show (ii) note that if the sequence of step lengths $\{\lambda_i\}$ satisfies
$\lambda_i \ge \bar\lambda > 0$ then (1.3) implies that

$$-\nabla F(x_i)h_i \le \frac{1}{\bar\lambda\sigma}\,\{F(x_i) - F(x_{i+1})\} \tag{2.3}$$

and the result follows from this, as the sequence $\{F(x_i)\}$ is decreasing and bounded
below so that $F(x_i) - F(x_{i+1}) \to 0$. It follows from (2.2) that $\lambda_i > 0$ for any
particular $x_i$, $h_i$. Thus $\bar\lambda = 0$ if and only if $\exists$ a subsequence
$\{\lambda_{\nu_i}\} \to 0$. It is the second test (1.4) which ensures that $\lambda$ does
not get too small in the set of allowable values for each $x$, $h$. Here it gives (using (2.2))

$$1 - \frac{\lambda_{\nu_i}\|h_{\nu_i}\|}{|F'(x_{\nu_i})|} \int_0^1 (1-s)\,F''(x_{\nu_i} + (\lambda_{\nu_i}h_{\nu_i})s)\,ds \le 1 - \sigma$$

which implies that

$$|F'(x_{\nu_i})| \le \frac{\lambda_{\nu_i}\|h_{\nu_i}\|}{\sigma} \int_0^1 (1-s)\,F''(x_{\nu_i} + (\lambda_{\nu_i}h_{\nu_i})s)\,ds$$

and the result follows from this and the assumed boundedness of $\|h\|$.

The conditions under which $\nabla F(x)h = 0 \Rightarrow x$ is a stationary point of $F$ require a deeper
study of the particular descent algorithm. However, if the downhill condition holds then this
implies either $\|\nabla F(x)\| = 0$ or $\|h\| = 0$, and the problem is reduced to ruling out cases in
which the second condition holds and $\|\nabla F(x)\| \ne 0$. One class of methods for which this can
be done simply is the class related to steepest descent in the sense that the downhill
condition can be stated in the form

$$\nabla F(x)h \le -\delta\,\|\nabla F(x)\|^2. \tag{2.5}$$

Clearly limit points are points at which $\|\nabla F(x)\| = 0$ in this case.
3.  A line  search  strategy.

In this section a procedure for satisfying (1.3) and (1.4) is given which is both
efficient and has guaranteed termination. The procedure is as follows. If at the current stage
of testing $\lambda_k$ fails either (1.3), or (1.4) with $k > 0$, then:

(i) If $\lambda_k$ fails (1.3) so that

$$F(x + \lambda_k h) > F(x) - \sigma\lambda_k\|h\|\,|F'(x)| \tag{3.1}$$

then form the quadratic interpolation polynomial

$$\phi_k(\lambda) = F(x) + \lambda\|h\|F'(x) + \lambda^2\|h\|^2 C_k$$

where $C_k$ is determined by

$$\phi_k(\lambda_k) = F(x + \lambda_k h)$$

giving

$$C_k = \frac{F(x + \lambda_k h) - F(x) - \lambda_k\|h\|F'(x)}{\lambda_k^2\|h\|^2} = \frac{|F'(x)|}{\lambda_k\|h\|}\,\bigl(1 - \psi(x, h, \lambda_k)\bigr). \tag{3.3}$$

$\lambda_{k+1}$ is now found by minimizing $\phi_k(\lambda)$, and this gives

$$\lambda_{k+1} = |F'(x)| / 2C_k\|h\|. \tag{3.4}$$

(ii) If $\lambda_k$ fails (1.4) with $k > 0$ then $\lambda_{k+1}$ is determined by applying a step of
the secant algorithm to $\psi(x, h, \lambda) - 1/2$ using $\lambda_k$ and the smallest value of $\lambda$
which has previously failed (1.3).
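The two-part strategy above can be sketched as follows. This is a minimal illustration, not the implementation referred to in section 5: the function name `weak_line_search`, its argument list, the list-of-floats representation and the `max_iter` safeguard are all assumptions. The quadratic interpolation step is coded in the equivalent form $\lambda_{k+1} = \lambda_k / 2(1 - \psi(x, h, \lambda_k))$, which follows from substituting (3.3) into (3.4).

```python
def weak_line_search(F, x, h, slope, lam0=1.0, sigma=1e-4, max_iter=50):
    """Sketch of the strategy of section 3 (hypothetical interface).

    F     : callable taking a list of floats
    x, h  : current point and downhill direction
    slope : grad F(x) . h, which must be negative
    Returns (lam, psi) for the accepted step.
    """
    if slope >= 0.0:
        raise ValueError("h must be downhill: grad F(x) . h < 0")
    F0 = F(x)

    def psi(lam):
        x_new = [xi + lam * hi for xi, hi in zip(x, h)]
        return (F0 - F(x_new)) / (-lam * slope)

    lam = lam0
    lam_fail = psi_fail = None      # smallest lam that has failed (1.3)
    for k in range(max_iter):
        p = psi(lam)
        if p < sigma:
            # (1.3) fails: minimise the interpolating quadratic.
            if lam_fail is None or lam < lam_fail:
                lam_fail, psi_fail = lam, p
            lam = lam / (2.0 * (1.0 - p))
        elif k > 0 and p > 1.0 - sigma:
            # (1.4) fails: one secant step on psi(lam) - 1/2 using lam and
            # the smallest lam that failed (1.3); k > 0 guarantees one exists.
            lam = lam - (p - 0.5) * (lam_fail - lam) / (psi_fail - p)
        else:
            return lam, p
    raise RuntimeError("no acceptable step found within max_iter trials")


# Illustrative objectives (assumptions, not from the report).
def quadratic(v):
    return v[0] ** 2 + 4.0 * v[1] ** 2


def quartic(v):
    return v[0] ** 4


lam_q, psi_q = weak_line_search(quadratic, [1.0, 1.0], [-2.0, -8.0], -68.0)
# A deliberately tight sigma, used only to exercise the secant branch.
lam_4, psi_4 = weak_line_search(quartic, [1.0], [-4.0], -16.0, sigma=0.45)
```

In the quadratic case the preset step fails (1.3) and a single interpolation lands on the line minimum. In the quartic case the interpolated step is too short for the tight tolerance, and one secant step brings the ratio back inside the accepted band.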

To show that termination is guaranteed we note that in the second part of the method we
are applying a standard root finding algorithm in a well behaved situation and with a tolerance
of $1/2 - \sigma$ on the function value. Thus we concentrate on the first part and note that it is
only necessary to show that the $\lambda_k$ are decreasing fast enough, for then (1.3) will be
satisfied in a finite number of steps for each $x$, $h$. We have from (3.3) and (3.4) that

$$\lambda_{k+1} = \frac{\lambda_k}{2\bigl(1 - \psi(x, h, \lambda_k)\bigr)} < \frac{\lambda_k}{2(1 - \sigma)} \tag{3.5}$$

if the test (1.3) fails for $\lambda = \lambda_k$. Thus the worst possible situation is one in which the
procedure finds a value of $\lambda$ small enough to satisfy (1.3) by essentially repeated bisection.
Note that by (3.3) and (3.4) $C_k$ and $\lambda_{k+1}$ are well determined quantities in this phase of
the algorithm.
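The substitution behind (3.5), and the geometric bound it implies, can be spelled out as below. The bound on $\lambda_k$ is a direct consequence added here for illustration; it assumes that (1.3) has failed at each of $\lambda_0, \dots, \lambda_{k-1}$, so that every step was an interpolation step.

```latex
\begin{aligned}
\lambda_{k+1}
  &= \frac{|F'(x)|}{2C_k\|h\|}
   = \frac{|F'(x)|}{2\|h\|}\cdot
     \frac{\lambda_k\|h\|}{|F'(x)|\,\bigl(1 - \psi(x, h, \lambda_k)\bigr)}
   = \frac{\lambda_k}{2\bigl(1 - \psi(x, h, \lambda_k)\bigr)}, \\
\psi(x, h, \lambda_j) < \sigma \ (j = 0, \dots, k-1)
  \quad&\Longrightarrow\quad
  \lambda_k < \frac{\lambda_0}{\bigl(2(1 - \sigma)\bigr)^{k}} .
\end{aligned}
```

Since $\sigma < 1/2$ the factor $2(1 - \sigma)$ exceeds 1, and for small $\sigma$ it is close to 2, which is the sense in which the worst case resembles repeated bisection.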

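To show that the use of quadratic interpolation is efficient we write $\psi(x, h, \lambda_{k+1})$ in
the form

$$\begin{aligned}
\psi(x, h, \lambda_{k+1})
 &= \frac{-\lambda_{k+1}\|h\|F'(x) - \lambda_{k+1}^2\|h\|^2 \int_0^1 (1-s)\,F''(x + (\lambda_{k+1}h)s)\,ds}{-\lambda_{k+1}\|h\|F'(x)} \\
 &= 1 - \frac{\lambda_{k+1}\|h\|}{|F'(x)|} \int_0^1 (1-s)\,F''(x + (\lambda_{k+1}h)s)\,ds \\
 &= 1 - \frac{1}{2C_k} \int_0^1 (1-s)\,F''(x + (\lambda_{k+1}h)s)\,ds \\
 &= 1 - \frac{1}{2}\,\frac{\int_0^1 (1-s)\,F''(x + (\lambda_{k+1}h)s)\,ds}{\int_0^1 (1-s)\,F''(x + (\lambda_k h)s)\,ds}
\end{aligned} \tag{3.6}$$

or, introducing mean values $\eta_k$, $\eta_{k+1}$,

$$\psi(x, h, \lambda_{k+1}) = 1 - \frac{1}{2}\,\frac{F''(x + \eta_{k+1})}{F''(x + \eta_k)}. \tag{3.7}$$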


This shows that if $F$ is quadratic then $\psi(x, h, \lambda_{k+1}) = 1/2$, so that never more than one
iteration is required in this case. The integral formula suggests that the local behaviour
of $F''$ will have to be fairly extreme before (1.3) fails repeatedly. For, if the mean values
can be estimated by the corresponding ordinates for the 1 point Gauss rule with $(1-s)$ as
weight function, then this gives $\eta_k \approx \frac{1}{3}\lambda_k h$. Thus each iteration significantly decreases the
effective interval length and increases strongly the chance that the local approximating
quadratic will be adequate.
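A small numerical check of these statements, written for a scalar problem, is sketched below. The helper names and the two test functions are assumptions made for the illustration. The check relies on the fact that the one-point Gauss rule for the weight $(1-s)$ on $[0, 1]$ has its node at $s = 1/3$ and is exact when $F''$ is linear in $s$, so for a cubic $F$ the mean-value form with $\eta_k = \frac{1}{3}\lambda_k h$ should reproduce $\psi$ to rounding error.

```python
def interpolated_step(F, x, h, dF, lam_k):
    """lam_{k+1} from (3.3)-(3.4) for scalar x and h; dF is F'(x)."""
    slope = dF * h                                   # grad F(x) h, negative
    C_k = (F(x + lam_k * h) - F(x) - lam_k * slope) / (lam_k * h) ** 2
    return abs(slope) / (2.0 * C_k * h * h)


def psi_actual(F, x, h, dF, lam):
    """psi(x, h, lam) evaluated from function values."""
    return (F(x) - F(x + lam * h)) / (-lam * dF * h)


def psi_predicted(d2F, x, h, lam_k, lam_next):
    """Mean-value form (3.7) with eta = lam * h / 3 (Gauss node s = 1/3)."""
    return 1.0 - 0.5 * d2F(x + lam_next * h / 3.0) / d2F(x + lam_k * h / 3.0)


def cubic(t):
    return t ** 2 + t ** 3


def d_cubic(t):
    return 2.0 * t + 3.0 * t ** 2


def d2_cubic(t):
    return 2.0 + 6.0 * t


def quad(t):
    return 3.0 * t ** 2 - t


def d_quad(t):
    return 6.0 * t - 1.0


x, h, lam0 = 1.0, -1.0, 1.0

lam1_cubic = interpolated_step(cubic, x, h, d_cubic(x), lam0)
cubic_actual = psi_actual(cubic, x, h, d_cubic(x), lam1_cubic)
cubic_predicted = psi_predicted(d2_cubic, x, h, lam0, lam1_cubic)

lam1_quad = interpolated_step(quad, x, h, d_quad(x), lam0)
quad_actual = psi_actual(quad, x, h, d_quad(x), lam1_quad)
```

The formula for $\psi(x, h, \lambda_{k+1})$ holds for any interpolation step with $C_k > 0$, so the check does not depend on whether the starting step would actually have been rejected by (1.3).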
4.  An  application  to  minimum  norm  problems.

Among generalizations of the Goldstein test one of the more interesting has been to
minimum norm problems [5]. Let $\|\cdot\|$ denote a vector norm and set

$$F(x) = \|f(x)\| \tag{4.1}$$

with $\dim f = n \ge p = \dim x$. The analogue of the Gauss-Newton algorithm generates a descent
direction by solving the linear subproblem

$$\min_h \|r\|, \quad r = \nabla f(x)h + f(x). \tag{4.2}$$


To generalize the Goldstein test we again compare the reduction in $F(x)$ with the reduction
achieved in the linear case, and this yields a procedure valid even if $\|\cdot\|$ is not
continuously differentiable (the case of polyhedral norms for example). Thus we consider

$$\psi(x, h, \lambda) = \frac{\|f(x)\| - \|f(x + \lambda h)\|}{\lambda\,\{\|f(x)\| - \min_h \|r\|\}}. \tag{4.3}$$

Provided $\|h\|$ is bounded (this is not guaranteed) then the Goldstein test forces convergence
to a point at which

$$\|f(x)\| = \min_h \|r\| \tag{4.4}$$

and such points are appropriately called stationary points [5].

Specialising the norm to the $\ell_2$ norm it is convenient to take $F$ as $\|f\|_2^2$ and to
make the obvious changes to (4.3). In this case the result corresponds exactly to (1.2). We
have

$$\nabla F(x) = 2f(x)^T \nabla f(x) \tag{4.5}$$

and

$$h = -\nabla f(x)^{+} f(x) \tag{4.6}$$

(the superscript $+$ denoting the generalized inverse), so that a limit point $x$ of this
variant of the Gauss-Newton algorithm is a point such that

$$0 = \lim_{i \to \infty} \nabla F(x_i)h_i = \lim_{i \to \infty} -2f(x_i)^T \nabla f(x_i)\nabla f(x_i)^{+} f(x_i) = \lim_{i \to \infty} -2\,\|P(x_i)f(x_i)\|_2^2$$
where $P(x_i)$ is the projector onto the range space of $\nabla f(x_i)$. Provided $\|h_i\|$ is bounded
it follows that if $n = p$ and $\nabla f(x_i)$ has full rank for each $i$ then $P(x_i) = I$ and
$0 = \lim_{i \to \infty} \|f(x_i)\|$. Thus the Newton algorithm is convergent to a solution even if $x$ is an
isolated singular point of $\nabla f(x)$.


Remark (i). In this case the condition $\|h\|$ bounded is necessary. For consider $f = 1 + x^2$.
We have $\nabla f = 2x$ and $h = -(1 + x^2)/2x$. Thus

$$\psi = 1 - \lambda\left(\frac{1 + x^2}{4x^2}\right) \quad\text{and}\quad x \to x\left(1 - \frac{\gamma}{2}\right) \quad\text{if}\quad \lambda = \gamma\,\frac{x^2}{1 + x^2}.$$

The Goldstein test is satisfied if $4(1 - \sigma) \ge \gamma \ge 4\sigma$, and the resulting sequence
of $x$ iterates converges to zero, which minimizes $\|1 + x^2\|$. This is a case in which
$\{\lambda_i\} \to 0$, and the argument of section 2 shows that indeed $F' \to 0$ as
$\lambda h = -\frac{\gamma}{2}x \to 0$. However $\nabla F h = -(1 + x^2) \not\to 0$. Note that $\min_h \|r\|$
is discontinuous at $x = 0$, so that $\|f\| = \min_h \|r\|$ is satisfied at the minimum although
$\|f\| - \min_h \|r\| = \|f\| = 1 + x^2$, $x \ne 0$.
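The behaviour in this example can be observed numerically. The sketch below assumes the generalized ratio (4.3) with the absolute value as norm, so that $\min_h \|r\| = 0$ whenever $x \ne 0$; the function name `newton_goldstein_iterates` and the particular choices $\gamma = 1$, $\sigma = 0.1$ are illustrative only.

```python
def newton_goldstein_iterates(x0, gamma, sigma=0.1, n_steps=10):
    """Iterate x -> x + lam * h for f(x) = 1 + x^2 with the Newton
    direction h = -f / f' and the step lam = gamma * x^2 / (1 + x^2).
    Returns a list of (x, h, lam, psi) tuples, one per step."""
    if not (4.0 * sigma <= gamma <= 4.0 * (1.0 - sigma)):
        raise ValueError("gamma outside the range accepted by the test")
    history = []
    x = x0
    for _ in range(n_steps):
        f = 1.0 + x * x
        h = -f / (2.0 * x)                    # Newton direction
        lam = gamma * x * x / (1.0 + x * x)
        f_new = 1.0 + (x + lam * h) ** 2
        min_r = 0.0                           # min_h |f'(x) h + f(x)|, x != 0
        psi = (abs(f) - abs(f_new)) / (lam * (abs(f) - min_r))
        history.append((x, h, lam, psi))
        x = x + lam * h
    return history


steps = newton_goldstein_iterates(1.0, gamma=1.0)
last_x, last_h, last_lam, last_psi = steps[-1]
```

Every step has $\psi = 1 - \gamma/4$ and is therefore accepted, the iterates halve at each step, the step lengths $\lambda_i$ shrink to zero and $|h|$ grows without bound, while $f$ stays above 1 and the system $f = 0$ is never solved.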

(ii) It is readily verified that in the region of second order convergence the choice $\lambda_0 = 1$
will satisfy (1.3).

(iii) Note that (4.3) can be applied in cases in which derivatives are not available provided
care is taken to ensure that $h$ is downhill. For example, if quasi Newton updates are used
to estimate $\nabla f(x)$ then we can show convergence to a stationary point $x^*$ of $F(x)$ provided
the appropriate Jacobian converges to $\nabla f(x^*)$. An algorithm of this kind has been realized
by using a method of special iterations due to Powell [6] to force this convergence.


5.  Numerical  experience. 

The procedure described here for satisfying the tests (1.3) and (1.4) has been coded in
an implementation of Fletcher's 1970 algorithm for the minimization of an unconstrained
function using conjugate direction methods [2] made by Michael Saunders and the author. This
algorithm is described in [4]. It used product form updating of the Hessian estimate, which was
stored as the upper triangular factor of its Choleski decomposition. The resulting
algorithm has been in regular use as the main unconstrained optimization subroutine using
derivative information in the subroutine library at the Australian National University for some
six years. The line search has proved a very satisfactory part of the algorithm, and it seems
most unusual for more than one or two function evaluations to be made in this part of the
calculation. It is unusual for the test (1.4) to fail once (1.3) has succeeded for the first time.
It does happen occasionally, however, and the strategy based on the secant algorithm handles
this contingency very adequately. One of the considerations in writing the algorithm in the
first place was that it be suitable for minimizing barrier and penalty objective functions,
and, presumably, these have provided it with quite a severe test.


REFERENCES

[1] Brent, R. P.: Algorithms for Minimization without Derivatives, Prentice-Hall, 1973.

[2] Fletcher, R.: A new approach to variable metric algorithms, Computer J., 13, p. 317-322, 1970.

[3] Goldstein, A. A.: On steepest descent, SIAM J. Control, 3, p. 147-151, 1965.

[4] Osborne, M. R.: Topics in optimization, report CS 279, Computer Science Department,
Stanford, 1972.

[5] Osborne, M. R. and Watson, G. A.: Nonlinear approximation problems in vector norms,
Numerical Analysis (ed. G. A. Watson), Lecture Notes in Mathematics no. 630,
Springer-Verlag, 1978.

[6] Powell, M. J. D.: A FORTRAN subroutine for unconstrained minimization requiring first
derivatives of the objective function, Report R 6469, AERE Harwell, 1970.

[7] Robinson, S. M.: Quadratic interpolation is risky, MRC Technical Summary Report #1839,
University of Wisconsin-Madison, 1978.


REPORT DOCUMENTATION PAGE

Title: An Efficient Weak Line Search with Guaranteed Termination
Type of report and period covered: Summary Report - no specific reporting period
Author: M. R. Osborne
Contract or grant number: DAAG29-75-C-0024
Performing organization: Mathematics Research Center, University of Wisconsin,
610 Walnut Street, Madison, Wisconsin 53706
Program element, project, task area and work unit numbers: Work Unit Number 7 -
Numerical Analysis
Controlling office: U. S. Army Research Office, P.O. Box 12211,
Research Triangle Park, North Carolina 27709
Report date: August 1978
Number of pages: 9
Security class. (of this report): UNCLASSIFIED
Distribution statement: Approved for public release; distribution unlimited.
Key words: Unconstrained minimization, Nonlinear equations, Descent algorithms,
Line search, Interpolation.



